ORACLE APPROACH, ADAPTATION AND INDEPENDENCE STRUCTURE

By Oleg Lepski

The paper deals with density estimation on $\mathbb{R}^d$ under sup-norm loss. We provide a fully
data-driven estimation procedure and establish for it a so-called sup-norm oracle inequality. The
proposed estimator takes into account not only the approximation properties of the underlying
density but a possible independence structure as well. Our results contain, as a particular case,
the complete solution of the bandwidth selection problem in the multivariate density model. The
usefulness of the developed approach is illustrated by application to adaptive estimation over
anisotropic Nikolskii classes.

1. Introduction. Let $(\Omega,\mathfrak{A},\mathrm{P})$ be a complete probability space and let
$X_i=(X_{1,i},\ldots,X_{d,i})$, $i\ge 1$, be a sequence of $\mathbb{R}^d$-valued i.i.d. random
variables defined on $(\Omega,\mathfrak{A},\mathrm{P})$ and having density $f$ with respect to
Lebesgue measure. Furthermore, $\mathbb{P}_f^{(n)}$ denotes the probability law of
$X^{(n)}=(X_1,\ldots,X_n)$, $n\in\mathbb{N}^*$, and $\mathbb{E}_f^{(n)}$ is the mathematical
expectation with respect to $\mathbb{P}_f^{(n)}$.

The objective is to estimate the density $f$. The quality of any estimation procedure, i.e. of any
$X^{(n)}$-measurable mapping $\hat f_n:\mathbb{R}^d\to\mathbb{L}_1(\mathbb{R}^d)$, is measured by
the sup-norm risk given by

$$\mathcal{R}_n^{(q)}(\hat f_n,f)=\Big(\mathbb{E}_f^{(n)}\|\hat f_n-f\|_\infty^q\Big)^{1/q},\quad q\ge 1.$$

It is well known that even asymptotically ($n\to\infty$) the quality of estimation given by
$\mathcal{R}_n^{(q)}$ depends heavily on the dimension $d$. However, these asymptotics can be
essentially improved if the underlying density possesses some special structure. Let us briefly
discuss one of these possibilities, which will be exploited in the sequel.

Introduce the following notations. Let $\mathcal{I}_d$ be the set of all subsets of
$\{1,\ldots,d\}$. For any $I\in\mathcal{I}_d$ denote $x_I=\{x_j\in\mathbb{R},\ j\in I\}$,
$\bar I=\{1,\ldots,d\}\setminus I$ and let $|I|=\mathrm{card}(I)$. Moreover, for any function
$g:\mathbb{R}^{|I|}\to\mathbb{R}$ we denote $\|g\|_{I,\infty}=\sup_{x_I\in\mathbb{R}^{|I|}}|g(x_I)|$.
Define also

$$f_I(x_I)=\int_{\mathbb{R}^{|\bar I|}}f(x)\,dx_{\bar I},\quad x_I\in\mathbb{R}^{|I|}.$$

In accordance with this definition we put $f_I\equiv 1$ for $I=\emptyset$. As we see, $f_I$ is the
marginal density of $X_{I,i}:=\{X_{j,i},\ j\in I\}$. Denote by $\mathfrak{P}$ the set of all
partitions of $\{1,\ldots,d\}$ completed by the empty set $\emptyset$, and we will use
$\bar\emptyset$ for $\{1,\ldots,d\}$. For any density $f$ let

$$\mathfrak{P}(f)=\Big\{\mathcal{P}\in\mathfrak{P}:\ f(x)=\prod_{I\in\mathcal{P}}f_I(x_I),\ \forall x\in\mathbb{R}^d\Big\}.$$

First we note that $f\equiv f_{\bar\emptyset}$ and, therefore, $\mathfrak{P}(f)\neq\emptyset$,
since $\bar\emptyset\in\mathfrak{P}(f)$ for any $f$. Next, if $\mathcal{P}\in\mathfrak{P}(f)$ then
$\{X_{I,i},\ I\in\mathcal{P}\}$ are independent random vectors. At last, if
$X_{1,i},\ldots,X_{d,i}$ are independent random variables then obviously
$\mathfrak{P}(f)=\mathfrak{P}$.

Suppose now that there exists $\mathcal{P}\neq\bar\emptyset$ such that
$\mathcal{P}\in\mathfrak{P}(f)$. If this partition is known we can proceed as follows. For any
$I\in\mathcal{P}$, based on the observation $X_I^{(n)}$, we first estimate the marginal density
$f_I$ by $\hat f_{I,n}$ and then construct the estimator of the joint density $f$ as

$$\hat f_n(x)=\prod_{I\in\mathcal{P}}\hat f_{I,n}(x_I).$$

One can expect (and we will see that this conjecture is true) that the quality of estimation
provided by this estimator will correspond not to the dimension $d$ but to the so-called effective
dimension, which in our case is defined as $d(\mathcal{P})=\sup_{I\in\mathcal{P}}|I|$. The main
difficulty in realizing the latter construction is that the knowledge of $\mathcal{P}$ is not
available. Moreover, our structural hypothesis may fail in general, which is expressed formally by
$\mathfrak{P}(f)=\{\bar\emptyset\}$. So, one of the problems we address in the present paper
consists in adaptation to the unknown configuration $\mathcal{P}\in\mathfrak{P}(f)$.

We note however that even if $\mathcal{P}$ is known, for instance $\mathcal{P}=\bar\emptyset$, the
quality of an estimation procedure often depends on the approximation properties of $f$ or
$\{f_I,\ I\in\mathcal{P}\}$. So, our second goal is to construct an estimator which would mimic an
estimator corresponding to the minimal, and therefore unknown, approximation error. In modern
statistical language, our goal here is to mimic an oracle. It is important to emphasize that we
would like to solve both aforementioned problems simultaneously. Let us now proceed with a
detailed consideration.

Collection of estimators. Let $K:\mathbb{R}\to\mathbb{R}$ be a given function satisfying the
following assumption.

Assumption 1. $\int K=1$, $\|K\|_\infty<\infty$, $\mathrm{supp}(K)\subset[-1/2,1/2]$, $K$ is
symmetric, and

$$\exists L>0:\quad |K(t)-K(s)|\le L|t-s|,\quad\forall t,s\in\mathbb{R}.$$

Put for $I\in\mathcal{I}_d$

$$K_{h_I}(u)=V_{h_I}^{-1}\prod_{j\in I}K(u_j/h_j),\qquad V_{h_I}=\prod_{j\in I}h_j.$$

For two vectors $u,v$, here and later $u/v$ denotes coordinate-wise division. We will use the
notation $V_h=\prod_{j=1}^d h_j$ instead of $V_{h_I}$ when $I=\{1,\ldots,d\}$. Denote also
$\mathrm{k}_m=\|K\|_m$, $m\in\{1,\infty\}$.

For any $p\ge 1$ let $\gamma_p:\mathbb{N}^*\times\mathbb{R}_+\to\mathbb{R}_+$ be the function whose
explicit expression is given in Section 2.3 (the expression is rather cumbersome and it is not
convenient to present it right now). Introduce the notations (recall that $q$ is the quantity
involved in the definition of the risk)

$$\mathcal{H}_n=\big\{h\in(0,1]^d:\ nV_h\ge(a^*)^{-1}\ln(n)\big\},\qquad a^*=\inf_{I\in\mathcal{I}_d}\big[2\gamma_{2q}(|I|,\mathrm{k}_\infty)\big]^{-2},$$

and for any $I\in\mathcal{I}_d$ and $h\in\mathcal{H}_n$ consider the kernel estimator

$$\hat f_{h_I}(x_I)=n^{-1}\sum_{i=1}^n K_{h_I}(X_{I,i}-x_I).$$

Introduce the family of estimators

$$\mathfrak{F}(\mathfrak{P})=\Big\{\hat f_{h,\mathcal{P}}(x)=\prod_{I\in\mathcal{P}}\hat f_{h_I}(x_I),\ x\in\mathbb{R}^d,\ \mathcal{P}\in\mathfrak{P},\ h\in\mathcal{H}_n\Big\}.$$

In particular, $\hat f_{h,\bar\emptyset}(x)=n^{-1}\sum_{i=1}^n K_h(X_i-x)$, $x\in\mathbb{R}^d$, is
the Parzen-Rosenblatt estimator (Parzen (1962), Rosenblatt (1956)) with kernel $K$ and
multi-bandwidth $h$. Our goal is to propose a data-driven selection from the family
$\mathfrak{F}(\mathfrak{P})$.
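The members of this family are easy to evaluate numerically. The following is a minimal sketch, not the paper's implementation: the triangular kernel is an illustrative choice that satisfies Assumption 1, and the function names `kernel`, `marginal_kde` and `product_kde` are hypothetical. Blocks are given as lists of 0-based coordinate indices.

```python
from math import prod

def kernel(u):
    """Triangular kernel on [-1/2, 1/2]: integrates to 1, symmetric, Lipschitz.
    This particular kernel is an assumption made for illustration only."""
    return 4.0 * (0.5 - abs(u)) if abs(u) <= 0.5 else 0.0

def marginal_kde(sample, x, h, block):
    """Kernel estimate at x of the marginal density of the coordinates in `block`.

    sample: list of d-dimensional observations, x: evaluation point,
    h: list of d bandwidths, block: 0-based coordinate indices (the set I).
    """
    volume = prod(h[j] for j in block)  # V_{h_I}
    total = sum(prod(kernel((obs[j] - x[j]) / h[j]) for j in block) for obs in sample)
    return total / (len(sample) * volume)

def product_kde(sample, x, h, partition):
    """Product of the marginal kernel estimates over the blocks of `partition`."""
    return prod(marginal_kde(sample, x, h, block) for block in partition)
```

With the single block `[[0, 1, ..., d-1]]` this reduces to the Parzen-Rosenblatt estimator; with singleton blocks it is the product of univariate estimates.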

The estimation of a probability density is the subject of a vast literature. We do not attempt to
provide a complete overview here and only present the results relevant in the context of the
considered problems. Minimax and minimax adaptive density estimation with $\mathbb{L}_s$-risks was
considered in Bretagnolle and Huber (1979), Ibragimov and Khasminskii (1980, 1981), Devroye and
Györfi (1985), Efroimovich (1986, 2008), Hasminskii and Ibragimov (1990), Donoho et al. (1996),
Golubev (1992), Kerkyacharian, Picard and Tribouley (1996), Juditsky and Lambert-Lacroix (2004),
Rigollet (2006), Mason (2009), Reynaud-Bouret, Rivoirard and Tuleau-Malot (2011) and Akakpo (2012),
where further references can be found. Oracle inequalities for $\mathbb{L}_s$-risks for $s=1$ and
$s=2$ were established in Devroye and Lugosi (1996, 1997, 2001), Massart (2007) [Chapter 7],
Samarov and Tsybakov (2007), Rigollet and Tsybakov (2007) and Birgé (2008). The last cited paper
contains a detailed discussion of recent developments in this area. The bandwidth selection problem
in density estimation on $\mathbb{R}^d$ with $\mathbb{L}_s$-risks for any $1\le s<\infty$ was
studied in Goldenshluger and Lepski (2011). The oracle inequalities obtained there were used for
deriving adaptive minimax results over the collection of anisotropic Nikolskii classes.

Adaptive estimation under sup-norm loss was initiated in Lepski (1991, 1992) and continued in
Tsybakov (1998) in the framework of the Gaussian white noise model. Then, it was developed for
anisotropic functional classes in Bertin (2005). The adaptive estimation of a probability density
on $\mathbb{R}$ in sup-norm was the subject of the recent papers Giné and Nickl (2009, 2010).

Organization of the paper. In Section 2 we present the data-driven selection procedure from
$\mathfrak{F}(\mathfrak{P})$ and establish for it the sup-norm oracle inequality. Section 3 is
devoted to adaptive estimation over the collection of anisotropic Nikolskii classes of functions.
The proofs of the main results are given in Section 4 and the technical lemmas are proven in the
Appendix.

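2. Oracle inequality. Let $\mathcal{P}\in\mathfrak{P}$ be fixed and define for any
$h,\eta\in\mathcal{H}_n$ and any $I\in\mathcal{P}$

$$\hat f_{h_I,\eta_I}(x_I)=n^{-1}\sum_{i=1}^n\big[K_{h_I}\star K_{\eta_I}\big](X_{I,i}-x_I),$$

where $[K_{h_I}\star K_{\eta_I}]=\prod_{j\in I}[K_{h_j}\star K_{\eta_j}]$ and
$[K_{h_j}\star K_{\eta_j}](z)=\int_{\mathbb{R}}K_{h_j}(u-z)K_{\eta_j}(u)\,du$, $z\in\mathbb{R}$.
As we see, "$\star$" is the convolution operator on $\mathbb{R}^{|I|}$. Define

$$\hat{\mathrm{f}}_n=\sup_{h\in\mathcal{H}_n}\sup_{I\in\mathcal{I}_d}\Big\|n^{-1}\sum_{i=1}^n\big|K_{h_I}(X_{I,i}-\cdot)\big|\Big\|_{I,\infty},\qquad\bar{\mathrm{f}}_n=1\vee 2\hat{\mathrm{f}}_n.$$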

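Let us endow the set $\mathfrak{P}$ with the operation "$\diamond$", putting for any
$\mathcal{P},\mathcal{P}'\in\mathfrak{P}$

$$\mathcal{P}\diamond\mathcal{P}'=\big\{I\cap I'\neq\emptyset,\ I\in\mathcal{P},\ I'\in\mathcal{P}'\big\}\in\mathfrak{P}.$$

Introduce for any $h,\eta\in\mathcal{H}_n$ and any $\mathcal{P},\mathcal{P}'$ the estimator

$$\hat f_{(h,\mathcal{P}),(\eta,\mathcal{P}')}(x)=\prod_{I\in\mathcal{P}\diamond\mathcal{P}'}\hat f_{h_I,\eta_I}(x_I),\quad x\in\mathbb{R}^d.$$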
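The operation on partitions is simply the common refinement: it keeps every non-empty intersection of a block of the first partition with a block of the second. A small sketch follows. The function names `diamond` and `effective_dimension` are illustrative, and blocks are represented as lists of coordinate labels.

```python
def diamond(p1, p2):
    """All non-empty intersections of a block of p1 with a block of p2."""
    blocks = {frozenset(a) & frozenset(b) for a in p1 for b in p2}
    # Drop the empty intersections and return blocks in a canonical order.
    return sorted((sorted(blk) for blk in blocks if blk), key=lambda blk: blk[0])

def effective_dimension(partition):
    """Size of the largest block, i.e. the supremum of |I| over the partition."""
    return max(len(block) for block in partition)
```

For instance, combining {{1,2},{3}} with {{1},{2,3}} gives the partition into singletons, while combining any partition with the trivial partition {{1,...,d}} returns it unchanged.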

Set finally A = sup-p g( p sup Ig p 72g(|I|, koo) and let A = Ad(f n ) ^ / 4 J +1 _ 
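2.1. Selection procedure. For any $\mathcal{P}\in\mathfrak{P}$ and $h\in\mathcal{H}_n$ set

$$\hat\Delta_n(h,\mathcal{P})=\sup_{\eta\in\mathcal{H}_n}\sup_{\mathcal{P}'\in\mathfrak{P}}\Big[\big\|\hat f_{(h,\mathcal{P}),(\eta,\mathcal{P}')}-\hat f_{\eta,\mathcal{P}'}\big\|_\infty-\lambda\hat A_n(\eta,\mathcal{P}')\Big]_+,$$

and let $\hat h$ and $\hat{\mathcal{P}}$ be defined as follows:

$$\hat\Delta_n(\hat h,\hat{\mathcal{P}})+\lambda\hat A_n(\hat h,\hat{\mathcal{P}})=\inf_{h\in\mathcal{H}_n}\inf_{\mathcal{P}\in\mathfrak{P}}\Big[\hat\Delta_n(h,\mathcal{P})+\lambda\hat A_n(h,\mathcal{P})\Big].$$

Our final estimator is $\hat f_{\hat h,\hat{\mathcal{P}}}(x)$, $x\in\mathbb{R}^d$.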
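The logic of the rule can be sketched over a finite list of candidates, each candidate standing for a pair (bandwidth, partition). This is only a schematic illustration under stated assumptions: the comparison term is taken to be the positive part of a sup-norm distance minus a multiple of a majorant, as used in steps (4.2)-(4.4) of the proof; the callables `distance` and `penalty` and the name `select` are hypothetical placeholders, and the suprema over continuous bandwidth sets are replaced by maxima over the list.

```python
def select(candidates, distance, penalty, lam):
    """Return the candidate minimising delta(c) + lam * penalty(c).

    distance(c, c2): sup-norm distance between the auxiliary estimator built
                     from (c, c2) and the estimator built from c2 (user supplied).
    penalty(c2):     majorant of the stochastic error of candidate c2.
    """
    def delta(c):
        # Largest excess of the pairwise distance over the majorant.
        return max(max(distance(c, c2) - lam * penalty(c2), 0.0) for c2 in candidates)

    return min(candidates, key=lambda c: delta(c) + lam * penalty(c))
```

A candidate with a small penalty but a large bias is rejected because its comparison term is large, and a candidate with a small bias but a large penalty is rejected through the second summand.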

Existence and measurability. Let us briefly discuss the existence of the proposed estimator as well
as its measurability with respect to the $\sigma$-algebra generated by $X^{(n)}$. First, we note
that all random fields considered in the paper have continuous trajectories on
$\mathcal{H}_n\times\mathbb{R}^d$ in the topology generated by the supremum norm. This is
guaranteed by Assumption 1. Since $\mathcal{H}_n$ is totally bounded and $\mathbb{R}^d$ can be
covered by a countable collection of totally bounded sets, any supremum over
$\mathcal{H}_n\times\mathbb{R}^d$ of the considered random fields is $X^{(n)}$-measurable. In
particular, this applies to $\hat{\mathrm{f}}_n$ and



A n (h,V,V') := sup sup 



f(h,V),(r,,'P') ~ U 



V 



, V,V heH n . 



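Since $\mathfrak{P}$ is finite, we conclude that $\hat\Delta_n(h,\mathcal{P})$ is
$X^{(n)}$-measurable for any $\mathcal{P}\in\mathfrak{P}$ and any $h\in\mathcal{H}_n$. Assumption 1
implies also that $\hat\Delta_n(\cdot,\mathcal{P})$ and $\hat A_n(\cdot,\mathcal{P})$ are
continuous on $\mathcal{H}_n$ for any $\mathcal{P}$. Since $\mathcal{H}_n$ is a compact subset of
$\mathbb{R}^d$, we conclude that $\hat h(\mathcal{P})\in\mathcal{H}_n$ and that it is
$X^{(n)}$-measurable for any $\mathcal{P}\in\mathfrak{P}$, see Jennrich (1969), where
$\hat h(\mathcal{P})=\arg\inf_{h\in\mathcal{H}_n}\big[\hat\Delta_n(h,\mathcal{P})+\lambda\hat A_n(h,\mathcal{P})\big]$.
Since $\mathfrak{P}$ is finite, we conclude that
$(\hat h,\hat{\mathcal{P}})\in\mathcal{H}_n\times\mathfrak{P}$ is $X^{(n)}$-measurable.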

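2.2. Main result. Let $\mathbf{f}>0$ be a given number and introduce the following set of densities

$$\mathbb{F}(\mathbf{f})=\Big\{f:\ \sup_{I\in\mathcal{I}_d}\|f_I\|_{I,\infty}\le\mathbf{f}\Big\}.$$

With any density $f\in\mathbb{F}(\mathbf{f})$, any $h\in(0,1]^d$ and $I\in\mathcal{I}_d$ associate
the quantity

$$\|b_{h_I}\|_{I,\infty}:=\Big\|\int_{\mathbb{R}^{|I|}}K_{h_I}(t_I-\cdot)\big[f_I(t_I)-f_I(\cdot)\big]\,dt_I\Big\|_{I,\infty},$$

which can be viewed as the approximation error of $f_I$.

For any $h\in\mathcal{H}_n$ and $\mathcal{P}\in\mathfrak{P}$ set
$B(h,\mathcal{P})=\sup_{\mathcal{P}'\in\mathfrak{P}}\sup_{I\in\mathcal{P}\diamond\mathcal{P}'}\|b_{h_I}\|_{I,\infty}$
and introduce the quantity

$$\mathfrak{R}_n(f)=\inf_{h\in\mathcal{H}_n}\inf_{\mathcal{P}\in\mathfrak{P}(f)}\Big\{B(h,\mathcal{P})+\sqrt{\frac{\ln(n)}{nV(h,\mathcal{P})}}\Big\}.$$

Theorem 1. Let Assumption 1 be fulfilled. Then for any $q\ge 1$ and any $0<\mathbf{f}<\infty$
there exist $C_1(q,d,K,\mathbf{f})$ and $C_2(q,d,K,\mathbf{f})$ such that for any
$f\in\mathbb{F}(\mathbf{f})$ and any $n\ge 3$

$$\Big(\mathbb{E}_f^{(n)}\big\|\hat f_{\hat h,\hat{\mathcal{P}}}-f\big\|_\infty^q\Big)^{1/q}\le C_1(q,d,K,\mathbf{f})\,\mathfrak{R}_n(f)+C_2(q,d,K,\mathbf{f})\,n^{-1/2}.$$

The explicit expressions of $C_1(q,d,K,\mathbf{f})$ and $C_2(q,d,K,\mathbf{f})$ can be found in the
proof of the theorem.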



Discussion. Let us briefly discuss the assertion of Theorem 1. We start with the following simple
observation. Let $\overline{\mathfrak{P}}$ be an arbitrary subset of $\mathfrak{P}$ containing
$\bar\emptyset$. If our selection rule runs over $\overline{\mathfrak{P}}$ instead of
$\mathfrak{P}$, then the result of the theorem remains valid if one replaces the quantity
$\mathfrak{R}_n(f)$ by

$$\overline{\mathfrak{R}}_n(f)=\inf_{h\in\mathcal{H}_n}\inf_{\mathcal{P}\in\overline{\mathfrak{P}}(f)}\Big\{B(h,\mathcal{P})+\sqrt{\frac{\ln(n)}{nV(h,\mathcal{P})}}\Big\},$$

where $\overline{\mathfrak{P}}(f)=\mathfrak{P}(f)\cap\overline{\mathfrak{P}}$. The reason for
considering $\overline{\mathfrak{P}}$ instead of $\mathfrak{P}$ is that the cardinality of
$\mathfrak{P}$ (the Bell number) grows as $(d/\ln(d))^d$. Therefore, for large dimension our
procedure is not practically feasible in view of the huge number of comparisons to be made. On the
other hand, if $d$ is large the consideration of all partitions is not reasonable. Indeed, even
theoretically the best attainable trade-off between approximation and stochastic errors
corresponds to the effective dimension defined as
$d^*(f)=\inf_{\mathcal{P}\in\mathfrak{P}(f)}\sup_{I\in\mathcal{P}}|I|$. Of course $d^*(f)\le d$,
but if it is proportional, for example, to $d$ then we will not gain much for a reasonable sample
size. A suitable strategy in the case of large dimension consists in considering only partitions
satisfying $\sup_{I\in\mathcal{P}}|I|\le d_0$, where $d_0$ is chosen in accordance with $d$ and the
number of observations. In particular, one can consider $\overline{\mathfrak{P}}$ containing only
two elements, namely $\bar\emptyset$ and $(\{1\},\{2\},\ldots,\{d\})$. This corresponds to the
hypothesis that we observe vectors with independent components.
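To get a feeling for the number of comparisons involved, one can enumerate the partitions directly and keep only those whose blocks do not exceed a prescribed size. The sketch below is illustrative only; the function names `set_partitions` and `restricted_partitions` and the cap argument `d0` are labels chosen here, not notation fixed by the paper.

```python
def set_partitions(elements):
    """Yield every partition of `elements` as a list of blocks."""
    elements = list(elements)
    if not elements:
        yield []
        return
    first, rest = elements[0], elements[1:]
    for smaller in set_partitions(rest):
        # Put `first` into each existing block in turn ...
        for i, block in enumerate(smaller):
            yield smaller[:i] + [[first] + block] + smaller[i + 1:]
        # ... or into a new block of its own.
        yield [[first]] + smaller

def restricted_partitions(d, d0):
    """Partitions of {1, ..., d} whose largest block has at most d0 elements."""
    return [p for p in set_partitions(range(1, d + 1))
            if max(len(block) for block in p) <= d0]
```

The total count is the Bell number (5 for d = 3, 15 for d = 4, 52 for d = 5), whereas the cap d0 = 1 leaves only the partition into singletons.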

Of course, considering $\overline{\mathfrak{P}}$ instead of $\mathfrak{P}$ has a price. It is
possible that $\mathfrak{P}(f)\cap\overline{\mathfrak{P}}=\{\bar\emptyset\}$ although
$\mathfrak{P}(f)$ contains elements besides $\bar\emptyset$. However, even in this case, where the
structural hypothesis fails or is not taken into account
($\overline{\mathfrak{P}}=\{\bar\emptyset\}$), our estimator completely solves the bandwidth
selection problem in the multivariate density model under sup-norm loss.

We finish this discussion with the following remark concerning the proof of Theorem 1. 

Remark 1. Our selection rule is based on the computation of upper functions for some special type
of random processes, and the main ingredient of the proof of Theorem 1 is an exponential inequality
related to them. The corresponding results may be of independent interest and Section 4.1 is
devoted to this topic. In particular, the function $\gamma_p$ involved in the construction of our
selection rule, which we present below, comes from this consideration.

2.3. Quantity $\gamma_p$. For any $a>0$, $p\ge 1$ and $s\in\mathbb{N}^*$ introduce



a ; 



7 p (s, a) = 4eyj2sT p (s, a) [a + (3L/2)(a) s " 1 ] + (16e/3) (s [a + (3L/2)a s_1 ] V 8a) r p (s, 
Tp (s,a) = s(234s5- 2 + 6.5p + 5.5) ln(2) + s(2p + 3) + [l08s5~ 2 | l°g(«)| + 36C S + l][ln(3)] _1 . 
Here 5* is the smallest solution of the equation 87r 2 <5(l + [ln<5] 2 ) = 1, C s = + and 



CP = s sup (T 2 

8>5 t 

CP = s sup 5~ l 

5>5« 



1 + ln 



/ 9216(8 + l)o 2 



i + mf 9216 (;+^ 



+ 1.5 



+ 1.5 



. /4608(s + l)o 2 

log2 H W 

. /4608(s + 1)6 

1o& < { W) 



where (f)(8) = (6/vr 2 )(l + [ln5] 2 ) \ S > 0. 
3. Adaptive Estimation. In this section we illustrate the use of the oracle inequality proved
in Theorem 1 for the derivation of adaptive rate optimal density estimators.

We start with the definition of the anisotropic Nikol'skii class of functions on $\mathbb{R}^s$,
$s\ge 1$, and later on $e_1,\ldots,e_s$ denotes the canonical basis in $\mathbb{R}^s$.

Definition 1. Let $r=(r_1,\ldots,r_s)$, $r_i\in[1,\infty]$, $\alpha=(\alpha_1,\ldots,\alpha_s)$,
$\alpha_i>0$, and $Q=(Q_1,\ldots,Q_s)$, $Q_i>0$. A function $g:\mathbb{R}^s\to\mathbb{R}$ belongs
to the anisotropic Nikol'skii class $\mathbb{N}_{r,s}(\alpha,Q)$ of functions if

$$\big\|D_i^k g\big\|_{r_i}\le Q_i,\quad\forall k=0,\ldots,\lfloor\alpha_i\rfloor,\ \forall i=1,\ldots,s;$$

$$\big\|D_i^{\lfloor\alpha_i\rfloor}g(\cdot+te_i)-D_i^{\lfloor\alpha_i\rfloor}g(\cdot)\big\|_{r_i}\le Q_i|t|^{\alpha_i-\lfloor\alpha_i\rfloor},\quad\forall t\in\mathbb{R},\ \forall i=1,\ldots,s.$$

Here $D_i^k g$ denotes the $k$th order partial derivative of $g$ with respect to the variable
$t_i$, and $\lfloor\alpha_i\rfloor$ is the largest integer strictly less than $\alpha_i$.

The functional classes $\mathbb{N}_{r,s}(\alpha,Q)$ were considered in approximation theory by
Nikol'skii; see, e.g., Nikol'skii (1977). Minimax estimation of densities from the class
$\mathbb{N}_{r,s}(\alpha,Q)$ was considered in Ibragimov and Khasminskii (1981). We refer also to
Kerkyacharian, Lepski and Picard (2001, 2007), where the problem of adaptive estimation over a
scale of classes $\mathbb{N}_{r,s}(\alpha,Q)$ was treated for the Gaussian white noise model.

Our goal now is to introduce the scale of functional classes of $d$-variate probability densities
taking into account the independence structure. It implies in particular that we will need to
estimate not only the density itself but all marginal densities as well. It is easily seen that if
$f\in\mathbb{N}_{p,d}(\beta,\mathcal{L})$ and additionally $f$ is compactly supported then
$f_I\in\mathbb{N}_{p_I,|I|}(\beta_I,\tilde{\mathcal{L}}_I)$ for any $I\in\mathcal{I}_d$, where
$\tilde{\mathcal{L}}=c\mathcal{L}$ and $c>0$ is a numerical constant. However, if
$\mathrm{supp}(f)=\mathbb{R}^d$ the latter assertion is not true in general. The assumption
$f\in\mathbb{N}_{p,d}(\beta,\mathcal{L})$ does not even guarantee that $f_I$ is bounded on
$\mathbb{R}^{|I|}$. This explains the introduction of the following anisotropic classes of
densities.

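Let $p=(p_1,\ldots,p_d)$, $p_i\in[1,\infty]$, $\beta=(\beta_1,\ldots,\beta_d)$, $\beta_i>0$,
$\mathcal{L}=(\mathcal{L}_1,\ldots,\mathcal{L}_d)$, $\mathcal{L}_i>0$.

Definition 2. A probability density $f:\mathbb{R}^d\to\mathbb{R}_+$ belongs to the class
$\overline{\mathbb{N}}_{p,d}(\beta,\mathcal{L})$ if

$$f_I\in\mathbb{N}_{p_I,|I|}(\beta_I,\mathcal{L}_I),\quad\forall I\in\mathcal{I}_d.$$

Introduce finally the collection of functional classes taking into account the smoothness of the
underlying density and the independence structure simultaneously.

Let $(\beta,p,\mathcal{P})\in(0,\infty)^d\times[1,\infty]^d\times\mathfrak{P}$ and
$\mathcal{L}\in(0,\infty)^d$ be fixed. Introduce

$$\mathbb{N}_{p,d}(\beta,\mathcal{L},\mathcal{P})=\Big\{f\in\overline{\mathbb{N}}_{p,d}(\beta,\mathcal{L}):\ f(x)=\prod_{I\in\mathcal{P}}f_I(x_I),\ \forall x\in\mathbb{R}^d\Big\}.$$

For any $(\beta,p,\mathcal{P})\in(0,\infty)^d\times[1,\infty]^d\times\mathfrak{P}$ define

$$\Upsilon(\beta,p,\mathcal{P})=\inf_{I\in\mathcal{P}}\gamma_I(\beta,p),\qquad\gamma_I(\beta,p)=\frac{1-\sum_{j\in I}\frac{1}{\beta_jp_j}}{\sum_{j\in I}\frac{1}{\beta_j}}.$$

We will see that the quantity $\Upsilon(\beta,p,\mathcal{P})$ can be viewed as an "effective
smoothness index" related to the independence structure hypothesis and to estimation under
sup-norm loss.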
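A small numerical illustration of how the independence structure changes the rate exponent may be useful. The sketch assumes that the block index is the ratio of $1-\sum_{j\in I}(\beta_jp_j)^{-1}$ to $\sum_{j\in I}\beta_j^{-1}$, that the effective index is its minimum over the blocks, and that the rate is $(\ln n/n)$ raised to the power index/(2 index + 1); these formulas are our reading of the definitions in this section, and the function names are hypothetical.

```python
import math

def gamma_block(block, beta, p):
    """Smoothness index of one block I (0-based coordinate indices)."""
    numerator = 1.0 - sum(1.0 / (beta[j] * p[j]) for j in block)
    denominator = sum(1.0 / beta[j] for j in block)
    return numerator / denominator

def upsilon(partition, beta, p):
    """Effective smoothness index: the smallest block index."""
    return min(gamma_block(block, beta, p) for block in partition)

def rate_exponent(partition, beta, p):
    """Exponent e such that the rate is (ln n / n) ** e."""
    u = upsilon(partition, beta, p)
    return u / (2.0 * u + 1.0)
```

For d = 2, smoothness (1, 1) and sup-norm smoothness scale p = (inf, inf), the trivial partition gives the exponent 1/4, whereas two independent coordinates give 1/3, the one-dimensional exponent.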
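Theorem 2. For any $(\beta,p,\mathcal{P})\in(0,\infty)^d\times[1,\infty]^d\times\mathfrak{P}$ such
that $\Upsilon(\beta,p,\mathcal{P})>0$ and any $\mathcal{L}\in(0,\infty)^d$

$$\liminf_{n\to\infty}\ \inf_{\tilde f_n}\ \sup_{f\in\mathbb{N}_{p,d}(\beta,\mathcal{L},\mathcal{P})}\mathbb{E}_f^{(n)}\Big[\varphi_n^{-1}(\beta,p,\mathcal{P})\big\|\tilde f_n-f\big\|_\infty\Big]>0,\qquad\varphi_n(\beta,p,\mathcal{P})=\Big(\frac{\ln n}{n}\Big)^{\frac{\Upsilon}{2\Upsilon+1}},$$

where $\Upsilon=\Upsilon(\beta,p,\mathcal{P})$ and the infimum is taken over all possible
estimators.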

Our goal is to prove that the estimation quality provided by $\hat f_{\hat h,\hat{\mathcal{P}}}$
on $\mathbb{N}_{p,d}(\beta,\mathcal{L},\mathcal{P})$ coincides, up to a numerical constant, with
the optimal decay of the minimax risk $\varphi_n(\beta,p,\mathcal{P})$ whatever the value of the
nuisance parameter $\{\beta,p,\mathcal{P},\mathcal{L}\}$. It means that this estimator is
optimally adaptive over the scale of considered functional classes. We would like to emphasize
that not only is the couple $(\beta,\mathcal{L})$ unknown, which is typical in adaptive
estimation, but also the index $p$ of the norms in which the smoothness is measured. At last, our
estimator adapts automatically to the unknown independence structure.

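Theorem 3. Let $K$ satisfy Assumption 1 and suppose additionally that for some $b\ge 2$

$$(3.1)\qquad\int_{\mathbb{R}}u^mK(u)\,du=0,\quad\forall m=2,\ldots,b.$$

Then for any $(\beta,p,\mathcal{P})\in(0,b]^d\times[1,\infty]^d\times\mathfrak{P}$ such that
$\Upsilon(\beta,p,\mathcal{P})>0$ and any $\mathcal{L}\in(0,\infty)^d$

$$\limsup_{n\to\infty}\ \sup_{f\in\mathbb{N}_{p,d}(\beta,\mathcal{L},\mathcal{P})}\mathbb{E}_f^{(n)}\Big[\varphi_n^{-1}(\beta,p,\mathcal{P})\big\|\hat f_{\hat h,\hat{\mathcal{P}}}-f\big\|_\infty\Big]<\infty.$$

We want to emphasize that the extra parameter $b$ can be arbitrary but must be chosen a priori.
Note that the condition (3.1) of the theorem is fulfilled with $m=1$ as well since $K$ is symmetric.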
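One simple way to obtain a kernel with vanishing moments, offered here only as an illustration and not as the construction used in the paper, is to multiply the weight $w(u)=1-4u^2$ on $[-1/2,1/2]$ by an even polynomial whose coefficients solve a small linear system of moment equations. The resulting function is symmetric, bounded, supported on $[-1/2,1/2]$ and Lipschitz. The weight and the function names below are assumptions of this sketch.

```python
from fractions import Fraction

def weight_moment(k):
    """Exact k-th moment of w(u) = 1 - 4u^2 over [-1/2, 1/2] (zero for odd k)."""
    if k % 2:
        return Fraction(0)
    return Fraction(2, (k + 1) * (k + 3)) / 2 ** k

def kernel_coefficients(b):
    """Coefficients c_0, c_1, ... of p(u) = sum_j c_j u^(2j) such that K = p * w
    integrates to 1 and has vanishing moments of every order 2, ..., b."""
    r = b // 2 + 1
    # Augmented system: sum_j c_j * moment(2i + 2j) = [i == 0], i = 0..r-1.
    rows = [[weight_moment(2 * i + 2 * j) for j in range(r)] + [Fraction(int(i == 0))]
            for i in range(r)]
    for col in range(r):  # Gauss-Jordan elimination in exact arithmetic
        pivot = next(i for i in range(col, r) if rows[i][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [v / rows[col][col] for v in rows[col]]
        for i in range(r):
            if i != col:
                factor = rows[i][col]
                rows[i] = [vi - factor * vc for vi, vc in zip(rows[i], rows[col])]
    return [row[-1] for row in rows]

def kernel(u, coeffs):
    """Evaluate K(u) = p(u) * (1 - 4u^2) on [-1/2, 1/2], zero elsewhere."""
    if abs(u) > 0.5:
        return 0.0
    poly = sum(float(c) * u ** (2 * j) for j, c in enumerate(coeffs))
    return poly * (1.0 - 4.0 * u * u)
```

For b = 2 the system has the exact solution $K(u)=\big(\tfrac{45}{16}-\tfrac{105}{4}u^2\big)(1-4u^2)$ on $[-1/2,1/2]$; odd moments vanish automatically by symmetry.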

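We remark also that for any given
$(\beta,p,\mathcal{P})\in(0,b]^d\times[1,\infty]^d\times\mathfrak{P}$ satisfying
$\Upsilon(\beta,p,\mathcal{P})>0$, one can find $\mathbf{f}=\mathbf{f}(\beta,p,\mathcal{P})$ such
that $f\in\mathbb{N}_{p,d}(\beta,\mathcal{L},\mathcal{P})$ implies that
$f\in\mathbb{F}(\mathbf{f})$. This makes the application of Theorem 1 possible.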

4. Proofs. We start this section with the computation of upper functions for the kernel estimation
process, which is one of the main tools in the proof of Theorem 1.

4.1. Upper functions for kernel estimation process. Let $s\in\mathbb{N}^*$ and let $Y_j$, $j\ge 1$,
be $\mathbb{R}^s$-valued i.i.d. random vectors defined on a complete probability space
$(\Omega,\mathfrak{A},\mathrm{P})$ and having density $g$ with respect to the Lebesgue measure.
Later on $\mathbb{P}_g^{(n)}$ denotes the law of $Y_1,\ldots,Y_n$, $n\in\mathbb{N}^*$, and
$\mathbb{E}_g^{(n)}$ is the mathematical expectation with respect to $\mathbb{P}_g^{(n)}$.

Let $M:\mathbb{R}\to\mathbb{R}$ be a given symmetric function and for any $r\in(0,1]^s$ set, as
previously,

$$M_r(u)=\prod_{l=1}^s r_l^{-1}M(u_l/r_l),\qquad V_r=\prod_{l=1}^s r_l.$$

Denote also $\mathrm{m}_m=\|M\|_m$, $m\in\{1,\infty\}$. For any $y\in\mathbb{R}^s$ consider the
family of random fields

$$\chi_r(y)=n^{-1}\sum_{j=1}^n\Big[M_r(Y_j-y)-\mathbb{E}_g^{(n)}M_r(Y_j-y)\Big],\qquad r\in\mathcal{R}_n(s):=\big\{r\in(0,1]^s:\ nV_r\ge\ln(n)\big\}.$$
For any $r\in(0,1]^s$ set $G(r)=\sup_{y\in\mathbb{R}^s}\int_{\mathbb{R}^s}|M_r(x-y)|g(x)\,dx$ and let $\bar G(r)=1\vee G(r)$.
Proposition 1. Let M satisfy Assumption 1. Then for any n > 3 and any p > 1 



eJ < sup 
I ren n (s) 



Xr 



s, m r 



'G(r)ln(n) 



< ci(p,s)[l VmfUglloo] 2 n 2+c 2 (p,s)n p , 



where ci(p,s) = 2 7p / 2+5 3 p+5s+4 r(p + 1)tt p (s, moo ) and c 2 (p,s) = 2P +1 3 5s . 
The function 7r : N* x M + :— >• R is given by 

tt(s, a) = (x/a V a) ^y/2es [1 + (3L/2)a fi - 2 ] V (2e/3) [l + (3L/2)a s " 2 ] V 8 



In view of trivial inequality 



in - v /Gf(r)ln(n) 

IXrHoo < Jp(s, moo) \ — + | \\Xr 



nV r 



7 P ( 



s,m c 



'G(r)]n(n) 
nV r 



we come to the following corollary of Proposition 1. 

COROLLARY 1. Let M satisfy Assumption 1. Then for any n > 3 and any p > 1 



l 



< [1 VmfUglloo] 2 7 p (s,m 00 ) + {ci(p,s) + c 2 (p,s)} 



-1/2 



r€7e„(s) 

Consider now the following family of random processes: for any i/£K s 

n 

T r (y) = n- 1 ]T \M r (Yj -y)\, re TZ^(s) := {r G (0, l] s : «K > cT 1 ln(n)} , 

where we have put a = [27 P (3,11100)] 2 . 

Proposition 2. Let M satisfy Assumption 1. Then for any n > 3 and any p > 1 

r(n) { sup [1 V||T r || 0O -(3/2)G(r)] } < ci(p, a) [l V mf ||g||oo] ^"f + c 2 (p, s)rT p ; 



El 



E 



(n) 



sup [G(r)-2(lV||T r (-)|U] } < 4(p, «) [l V mf ||g||oo] f n"i + c' 2 (p, s)n 



where c[(p,s) = 2 p c 1 (p,s) and c' 2 (p,s) = 2 2p+1 3 5s . 

4.2. Proof of Theorem 1. We start the proof of the theorem with auxiliary results used in the
sequel, whose proofs are given in the Appendix.



4.2.1. Auxiliary results. Introduce the following notations. For any I G Id set 

s hl (-)= [ # hl (*i--)fi(*l)d*i, 4 (■)=/ [K hl *K m ] (ti — -)/i(ti)dfi; 

JIRl 1 ! JRl J l 

Lemma 1. For any I G 2^ and any h, r\ G (0, l]' 1 ' one /&as 

~ s? ?illi,oo - k i &?i i- 

For any h G (0, l] d and any P G *P set 



s n ln(n) 

—— — — -, s„ = lV sup sup 

nV{h,T) heH n i£T d 



\K hl (ti - -)\fi(ti)dti 



I.oo 



Put also ^j(-) = /^(O - s ftl (-) and let 



C(^,^) = SUp||£/J| Ioo , Cn = SUP SUP 

lev t)e«„ Peqj 



C(t?,T') -AA n ( V ,V) , 
J + 

Lemma 2. For any p> 1 there exist Ci(p,d,JZ,f) , i = 1,2,3,4, such that for any n > 3 



(i) 

(ii) 

(iii) 

(iv) 



sup 

/6F(f) 

sup 

/6F(f) 

sup 

/6F(f) 

sup 

/GF(f) 



4 n) (Cn) 2<? 29 <c 1 (2g,d,K,f)n- 1 /2 ; 



E ( f n) [ S - n -f„]^ 29 <c 2 (2g,d,K,f)n" 1 /2. 



E< n) [fn-3 S " n ]^ 29 <C3(2g,d,K,f)n^/2. 



1 2g 



4 n) (fn) P r <C 4 (p,d,K,f). 



The explicit expression of Cjfp, d, K, f) , i = 1, 2, 3, 4 can be found in the proof of the lemma. 

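4.2.2. Proof of Theorem 1. We break the proof into several steps.

1°. Let $h\in\mathcal{H}_n$ and $\mathcal{P}\in\mathfrak{P}$ be fixed. We have in view of the
triangle inequality

$$(4.1)\qquad\big\|\hat f_{\hat h,\hat{\mathcal{P}}}-f\big\|_\infty\le\big\|\hat f_{\hat h,\hat{\mathcal{P}}}-\hat f_{(h,\mathcal{P}),(\hat h,\hat{\mathcal{P}})}\big\|_\infty+\big\|\hat f_{(h,\mathcal{P}),(\hat h,\hat{\mathcal{P}})}-\hat f_{h,\mathcal{P}}\big\|_\infty+\big\|\hat f_{h,\mathcal{P}}-f\big\|_\infty.$$

We have

$$(4.2)\qquad\big\|\hat f_{(h,\mathcal{P}),(\hat h,\hat{\mathcal{P}})}-\hat f_{\hat h,\hat{\mathcal{P}}}\big\|_\infty\le\hat\Delta_n(h,\mathcal{P})+\lambda\hat A_n(\hat h,\hat{\mathcal{P}}).$$

Noting that $\hat f_{(h,\mathcal{P}),(\hat h,\hat{\mathcal{P}})}\equiv\hat f_{(\hat h,\hat{\mathcal{P}}),(h,\mathcal{P})}$
we get

$$(4.3)\qquad\big\|\hat f_{(h,\mathcal{P}),(\hat h,\hat{\mathcal{P}})}-\hat f_{h,\mathcal{P}}\big\|_\infty\le\hat\Delta_n(\hat h,\hat{\mathcal{P}})+\lambda\hat A_n(h,\mathcal{P}).$$

We obtain from (4.2) and (4.3)

$$\big\|\hat f_{\hat h,\hat{\mathcal{P}}}-\hat f_{(h,\mathcal{P}),(\hat h,\hat{\mathcal{P}})}\big\|_\infty+\big\|\hat f_{(h,\mathcal{P}),(\hat h,\hat{\mathcal{P}})}-\hat f_{h,\mathcal{P}}\big\|_\infty\le\big[\hat\Delta_n(h,\mathcal{P})+\lambda\hat A_n(h,\mathcal{P})\big]+\big[\hat\Delta_n(\hat h,\hat{\mathcal{P}})+\lambda\hat A_n(\hat h,\hat{\mathcal{P}})\big]\le 2\big[\hat\Delta_n(h,\mathcal{P})+\lambda\hat A_n(h,\mathcal{P})\big].$$

To get the last inequality we have used the definition of $(\hat h,\hat{\mathcal{P}})$. Thus, we
obtain from (4.1) that

$$(4.4)\qquad\big\|\hat f_{\hat h,\hat{\mathcal{P}}}-f\big\|_\infty\le 2\big[\hat\Delta_n(h,\mathcal{P})+\lambda\hat A_n(h,\mathcal{P})\big]+\big\|\hat f_{h,\mathcal{P}}-f\big\|_\infty.$$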



2°. Note that for any h,r] £ 7i n and any V' G 



(4-5) f(h,V),(v,V') ~ Ur < d(f n ) ^ /4 J +1 sup Yl fh InI ,, VlnI , ~ f Vv 



I' 67" 



lev. inl'^Q 



I'oo 



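Here we have used the trivial inequality: for any $m\in\mathbb{N}^*$ and any
$a_j,b_j:\mathcal{X}_j\to\mathbb{R}$, $j=1,\ldots,m$,

$$(4.6)\qquad\Big\|\prod_{j=1}^m a_j-\prod_{j=1}^m b_j\Big\|_\infty\le m\Big[\sup_{j=1,\ldots,m}\|a_j-b_j\|_{\mathcal{X}_j,\infty}\Big]\Big[\sup_{j=1,\ldots,m}\max\big(\|a_j\|_{\mathcal{X}_j,\infty},\|b_j\|_{\mathcal{X}_j,\infty}\big)\Big]^{m-1},$$

where $\|\cdot\|_{\mathcal{X}_j,\infty}$ and $\|\cdot\|_\infty$ denote the supremum norms on
$\mathcal{X}_j$ and $\mathcal{X}_1\times\cdots\times\mathcal{X}_m$ respectively.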
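The inequality follows from the telescoping identity $\prod_j a_j-\prod_j b_j=\sum_{k}\big(\prod_{j<k}b_j\big)(a_k-b_k)\big(\prod_{j>k}a_j\big)$, in which each of the $m$ summands contains one difference and $m-1$ factors bounded by the larger of the two norms. A quick numerical sanity check for constant functions (plain numbers) can be sketched as follows; the helper name `product_gap_bound` is illustrative.

```python
import math

def product_gap_bound(a, b):
    """Right-hand side of the product inequality for two lists of numbers."""
    m = len(a)
    gap = max(abs(x - y) for x, y in zip(a, b))            # largest |a_j - b_j|
    size = max(max(abs(x), abs(y)) for x, y in zip(a, b))  # largest modulus
    return m * gap * size ** (m - 1)

def product_gap(a, b):
    """Left-hand side: absolute difference of the two products."""
    return abs(math.prod(a) - math.prod(b))
```

For a = (1, 2, 3) and b = (1.5, 2, 2.5) the two products differ by 1.5, while the bound evaluates to 3 * 0.5 * 3^2 = 13.5.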
Introduce the following notation: for any h,r] G % n and any P 6 !p we set 

= fhi,TRv) — s h 1 ,r) I \) 

We have in view of (4.6) (here and later the product and the supremum over empty set are assumed 
equal to one and to zero respectively) 



(4.7) 



n /w,%m' - n *iw.w - d [ m ax{f n ,k2f}] d 1 sup & InI „ w 



lev 



I'oo 



lev 



Jnl'.oo 



We remark that for any I G X^, any /i, G (0, and any zi G M' 1 ' 



7riii 



and, therefore, 



l^i,mlli,oo - kl11 ' H^ilkoo - kld H&iHl.oo » 
since ki > 1 in view of Assumption 1. It yields together with (4.7) 

< dki d [max{f n ,k?f}] d_1 supJ^ In J ^. 



(4.8) 



II fh In ii,Vinl' II ^inl'fl/m' 



Note also that for any 7] G % n and I' G X^ 



I' oo 



3 %'(") = / , ^w(*r ~ Oil' (*i') d *r = / , ^%'(*r ~ ") II /rni'(*rni') dt r = ]^[ s^ /nI ,( 
jR 1 «' Ljg-p J lev 
Here we have used that $\mathcal{P}\in\mathfrak{P}(f)$. Using once again (4.6) we obtain



II S h InI >,ri InI i II s «jm' 



and, therefore, in view of Lemma 1 



< d\k\i] d 1 sup 

I'oo I&P 



Jnl',00 



(4.9) 



lev 



< ^[kffj^supl^ll 
r, 00 lev 



Thus, we obtain from (4.8) and (4.9) 



IT fhlnlhVmV fvi* 
lev. 



<fn 



I' 00 



sup||a inI , 



../nr,oo + ™P||^m' 
lev ' lev 



Jni',00 



where we have put f n = 2dkf [max {f n , kff }] d 

Therefore, we get from (4.5) for any h,rj G "H n and "P' G *p 



(4.10) f(h,v),( v ,v') ~ hr> <u{c{KVoP)+ sup ||^J Ioo ) +fnC(^^')- 

00 ^ leVoV ' ) 

Here we have put f n = d(f n ) ^ ^ 4 J +1 and f n = f n f n Taking into account that for any /i G H n and 
any 7>, V G 

A n (ft, VoV')< A n (h, V) A A n (h, V') , 

we get from (4.10) 



(4.11) 



f(h,v),( v ,-P') - fvV < fn\kA n (h,V) + sup \\b hl \\ Ioo + Cn> +fnC{v,'P')■ 
IeVoV , 



Remembering that A = f n A, we obtain from (4.11) 

A n (h, V)<U AA n (h, V)+B (h, V) + Cn + fn Cn + A sup sup \A n {rj,T > ) — A n {rj,T')\ >, 

rjeHn VeV L J + > 



where, remind B{h,V) = sup-p/ sup Ig:Po -p/ ||&/n|| Ioo - 

Taking into account that f n > f n , since f n > 1 we finally get 



(4.12) A^P) <f^AAi(/»,P) +B(h,V) +2Cn + A sup sup 

«G«„ VeV 



A n (ri,V) - Mr),V)} }. 



Note that the definition of T~L n implies that 



A n (ri,V) - A n (ri,V) 



< a* 



<a*[s n -f n ] Vr]£H n ,VV£V. 



To get the last inequality we have also used that by definition f n ,s n > 1. 
Putting R n = o*A [s n — f n ] we obtain in view of (4.12) 



(4.13) 



A n (h,T) <f n {AA n (h,V) + B(h,T) +2( n + Rn}, 
Note also that the definition of $\mathcal{H}_n$ implies that



A n (h,V) - V3A n {h,V) 



<a*[f n -3s n ] V77G^ n , VPGP. 



< O* V f„ - \/3Sn 

Thus, denoting 1Z n = a* A [f n — 3s n ] + we obtain using (4.13) 

(4.14) A n {h,V) + XA n (h,V) <~f n {3AA n (h,V) +B(h,V) +2( n + R n +Tl n }, 

where we have used also \/3 < 2. 

3°. Note that in view of V G <£(/), (4.6) and (4.13) 



fh,V - f 



lev lev 

< d [max{f n ,kff}] d 1 sup f hl (xi) - fi{xi) 



lev 



7,oo 



(4.15) < d[max{f„,k?f}] d 1 [j3(fc, P) + C(fc, P)] < f„ [b(/i, P) + AA n (/i, 7>) + Cr, 

Here we have also used that V = V o V . We obtain from (4.4), (4.14) and (4.15) 



fh3 f 



< f n [3B(h,V) + 7AA n {h,V) + 5Cn + 2Rn + 2TZ n ] , 



and, therefore, for any h G H n , "P G ^3 and g > 1 



( 4 -!6) (e^II^-ZH, 

where we have put for p > 1 



3B(fc,7>) + 7AA n (fr,-p) 



2q 



5yi,n + 2Aa* (y 2 ,n + 2/3,n) 



4 n) (fn) P 



, 2/1,71 



4 n) (Cn) 2<? 



E^ n) [ 



-'9 



, J/3,n 



E^ n) [f n - 3s n 



2q 



Taking into account that the right hand side of (4.16) is independent of the choice h and "P and 
that the quantity s n < 1 V [kif] we get 



E (n) ll ~ ~ 

\\ J h[V],V 



f\ 



< 7AE q [ inf inf 



heH n VeW) 
5yi n + 2Aa* (y 2 . n + 2/3 



B(h,V)+A n (h,V) 



Ci (q, d, K, f) JR(/) + E 2q 5yi,„ + 2Aa* (y 2 , n + y 3 ,„) 



where we have put Ci(g,d,K,f) = 7AE qy /l V [k x f]. 

This inequality together with the bounds found in Lemma 2 leads to the assertion of the theorem.

4.3. Proof of Theorem 2. The proof of Theorem 2 is relatively standard and is based on the general
result established in Kerkyacharian, Lepski and Picard (2007), Proposition 7. For convenience we
formulate this result not in full generality but in a version reduced to the considered problem.
Let $(\beta,p,\mathcal{P})\in(0,\infty)^d\times[1,\infty]^d\times\mathfrak{P}$ such that
$\Upsilon(\beta,p,\mathcal{P})>0$ and $\mathcal{L}\in(0,\infty)^d$ be fixed.

Lemma 3. Assume that there exist /o G N P) d(/3,£,'P), p n > 0, n G N*, and a finite setJ n such 
that for any sufficiently large n G N* one can find {/®, j G J n } C ~Np^{/3,C,V) satisfying 



(4.17) 
(4.18) 

Then for r > 1 



11/ 



U) 



lim supE^ 71 ' 



/o 1 1 oo — Pn j 
1 



Vj G J n ; 



I Jn I .' 



dP 



(-0 
/(j) 



»(«) 
/o 



C < oo. 



lim inf inf 

n— >oo j 



sup 

/eN Pj£i (/3,£,P 



Pn 1 f4 n) ||/-/| 



' > 2- 



1- VC/(C + 4) 



where infimum is taken over all possible estimators. 



Proof of the theorem. Set N(x) = Y\ i= \ (j 27r ]~ ^P~{ x l/ 2 } J and let f Q (x) = o- - W(z/<r), 

where cr > is chosen in such a way that /o belongs to the class N Pi d(/3, £/2) . We remark that in 
order to obey the latter restriction it suffices to choose a satisfying 



(4.19) 



sup sup a 
lex d iel 



i=lA 



The product structure of /o together with (4.19) allows us to assert that /q G N Pt d(/3,C/2,V) for 
any V G ty. Let I* G {1, . . . , d} be defined from the relation 

T(p,p,V) := inf 7l (/3,p) = 7P (/3,p), 

and for the notation convenience the elements of I* will be denoted by i±, . . . , i m and m = \I*\. 

Let g : R -> R be compactly supported on (—1/2, 1/2) function, satisfying g G r\i£i*N Pit i(/3i, 1/2), 
and such that f g = 0. Suppose also that |g(0)| = ||g||oo- 

Let A n — > and 5i >n — > 0, / = 1, m, n — > oo, be sequences whose choice will be done later and 
[1, . . . , Mi, n ] x • • • x [1, . . . , M m>n ] C N m , where M u = [5~„ 1/2 J , Z = T, 



For any j = {j u 
we put = ji5i n . The choice of g implies 



■ jm) G J„ define Gj(xi) = A n rj™iS 



(j) 



m. 

Here for any j G J„ 



(4.20) 



GjGV = 0, Vj,j'G J n , j/j'. 



Note also that the system of equations 
(4-21) A n 5-^(H6, 



\ i/fi 



Z=l 



fc = 1, m, 
m-1



implies that Gj 6 N pi> ^Pi,Ci/2) for any j G J,„. Here we have denoted c k = ) 
Introduce the family of functions {/®, j 6 J n } as follows. 

/ Ci) (^) = f[ (M- 1/2 exp-{x?/2a 2 }Vn M-^exp-lo.f^l + G'j^)). 

First we remark that A n — > 0, n — > oo, implies that fv) > for all sufficiently large n. Next, 
the assumption J g = implies that f /W = 1. Thus, /w is a probability density for any j G J„ 
for all sufficiently large n. At last the choice of fo together with (4.21) allows us to assert that 
€N p , d {P,£,V) for anyjG J n . 
Thus, we conclude that Lemma 3 is applicable to the family {/®, j E J n } . We remark also that 



(4.22) 

where we have put c\ = \g(0)\ m I 2it<j 2 



ll/ (j) -/olL = c i^ v j £ J - 

(m-d)/2 



Here we have also used that |g(0)| = ||g||oo- We 
conclude that the assumption (4.17) is fulfilled with p n = c*[A n . 

Let us now proceed with the verification of the condition (4.18) of Lemma 3. Note first that 



dP 



(n) 



£(**") = n 



fo 



J fo(X k ) 



and, therefore, 



(4.23) 



i dP (n) 



IT I 2 



En 



k(x k ) 



+ e n- /(j)(Xfe)/(j,)(Xfc) 



j,j'eJ n : fc=i 



mx k ) 



Since X k , k = 1, n are i.i.d. random vectors, we have for any j ^ j' 



E 



c»)J A/ a) (**)/ a,) (** 



Jo 



n 

fc=i 



1 + 



/i*,o (zi*) 



1 + 



/i*,o (^p) 



/i*,o(2;i*) dx i* f = L 



To get the last equality we have used (4.20) and the fact that L|i*| G$(xi*)dxi* = since f g = 0. 
The latter result together with (4.23) yields 



£ — E (n) 
Ln ■- ^fo 



LI atM V J 

Gj(xp) 



(n) 



E 

jeJ n 



1 + 



(4.24) 





r <3 m l 


/ 


./i*,o(y). 



/p ,0(2:1*) 
dy 



/l*,o(a;i*)dxi* 
Since, Gj(y) = for any y ^ [0, y^nj x • x [0, 5 mjn ] =: y n we have for all n large enough
inf ye y n /i* )0 (y) > 2" 1 27TO- 2 . It yields together with (4.23), putting c* 2 = 2 2vrcr 2 ||ff||| m , 



/ m v n 

^ < iJ«i -i (i+^n^») • 

^ z=l ' 



If we choose A n and <5; jn , I = l,m satisfying 

m m 

(4-25) ^n^n^<(l/4)ln(n^n) <1*(|J»I), 

i=i i=i 

for all n > 1 large enough, then £ n < 1 and, therefore, the condition (4.18) is fulfilled with C = 1. 

Thus, we have to choose A n and Si in , I = l,m satisfying (4.21) and (4.25). Let t > be the 
number whose choice will be done later. Consider instead of (4.25) the equation 



(4.26) 



nA 2 n Y[Si >n = t 2 ln(ra). 



l=l 



and solve (4.21) and (4.26). Straightforward computations yield 

-i s-^m 1 



A n = R(st) 1 - J:r = 1 ^rV^, 6 hn = A^ P " P " (fe) 



m [ 1 l \ I -J^ ~q~~ — 



n™i (c;/A) 2ft < i=1 V p ^W. Moreover we have in view of (4.26) 

Em J_ 
1=1 ft, 



m \ -V 2 

3 Iin | = R{et)- a , a 



n*. 



and, therefore, (1/4) In (j$jt x - (o/2)ln(n), n — > oo, Hence, choosing t as an arbitrary 

number satisfying t 2 < (2c2) _1 a we guarantee that (4.26) implies (4.25) for all n large enough. 
Thus, we conclude that Lemma 3 is applicable with 



1 ~ £ '=i ggj7 



1 1 









m 1 

It remains to note that the definition of I* implies that Y({3,p,V) = — =5 — r 1 ^ 1 -. We remark that 



m 1 



2T(/3,p,7>)+l 2 (l-££ 1 
and the assertion of the theorem follows. 



1 1 

Ph 2 



<0 
4.4. Proof of Theorem 3. The proof of the theorem is based on the application of Theorem 1 and
on Lemma 4 below, which allows us to bound from above the quantity $B(h,\mathcal{P})$. The assertion of the
lemma, whose proof is postponed to the Appendix, is based on the embedding theorem for anisotropic
Nikolskii spaces. For any function $g:\mathbb{R}^s\to\mathbb{R}$ and any $\eta\in(0,\infty)^s$ set

$$
B_{\eta,g}(z)=\int_{\mathbb{R}^s}K_\eta(t-z)g(t)\,dt-g(z),\quad z\in\mathbb{R}^s.
$$

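Lemma 4. Let $K$ satisfy Assumption 1 and (3.1). Let $(\alpha,r)\in(0,b]^s\times[1,\infty]^s$ be such that
$\varkappa=1-\sum_{l=1}^{s}(\alpha_lr_l)^{-1}>0$ and $Q\in(0,\infty)^s$. Then there exists $c=c(s,r,b)>0$ such that

$$
\sup_{g\in\mathbb{N}_{r,s}(\alpha,Q)}\big\|B_{\eta,g}\big\|_\infty\le c\,k_1^{s}\sum_{l=1}^{s}Q_l\eta_l^{\boldsymbol{\alpha}_l},\quad\forall\eta\in(0,\infty)^s.
$$

Here $\boldsymbol{\alpha}=(\boldsymbol{\alpha}_1,\ldots,\boldsymbol{\alpha}_s)$, $\boldsymbol{\alpha}_l=\varkappa\alpha_l\varkappa_l^{-1}$ and $\varkappa_l=1-\sum_{j=1}^{s}\big(r_j^{-1}-r_l^{-1}\big)\alpha_j^{-1}$.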
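The rate $\eta^{\alpha}$ in Lemma 4 can be illustrated in the simplest case $s=1$. The sketch below is only an illustration under assumptions that are not taken from the paper: the function is $g(t)=|t|^{\alpha}$ with $\alpha\in(0,1]$, and the kernel is the uniform kernel on $[-1/2,1/2]$, which need not satisfy Assumption 1. In this case $B_{\eta,g}(0)=\int_{-1/2}^{1/2}|\eta u|^{\alpha}du=\eta^{\alpha}2^{-\alpha}/(\alpha+1)$, and the code compares a midpoint-rule value with this closed form. The names `bias_at_zero` and `closed_form` are hypothetical.

```python
def bias_at_zero(eta, alpha, n_grid=20000):
    """Midpoint-rule approximation of B_{eta,g}(0) for s = 1.

    Assumptions (illustrative, not from the paper): g(t) = |t|**alpha and
    K is the uniform kernel on [-1/2, 1/2], so K(u) = 1 on its support.
    """
    step = 1.0 / n_grid
    total = 0.0
    for k in range(n_grid):
        u = -0.5 + (k + 0.5) * step            # midpoint of the k-th cell
        total += abs(eta * u) ** alpha * step  # K(u) * g(eta * u)
    return total  # g(0) = 0, so no term is subtracted


def closed_form(eta, alpha):
    """Exact value of the same integral: eta^alpha * 2^(-alpha) / (alpha + 1)."""
    return eta ** alpha * 0.5 ** alpha / (alpha + 1.0)


if __name__ == "__main__":
    for eta in (0.4, 0.2, 0.1):
        print(eta, bias_at_zero(eta, 0.5), closed_form(eta, 0.5))
```

Halving $\eta$ multiplies the bias by $2^{-\alpha}$, which is the scaling in $\eta$ that the bound of the lemma predicts coordinate by coordinate.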

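Proof of Theorem 3. Let $(\beta,p,\mathcal{P})\in(0,b]^d\times[1,\infty]^d\times\mathfrak{P}$ such that $\Upsilon(\beta,p,\mathcal{P})>0$ and $L\in(0,\infty)^d$
be fixed. For any $\mathrm{I}\in\mathcal{I}_d$ and any $i\in\mathrm{I}$ define

$$
\beta_i(\mathrm{I})=\tau(\mathrm{I})\beta_i\tau_i^{-1}(\mathrm{I}),\qquad
\tau(\mathrm{I})=1-\sum_{l\in\mathrm{I}}(\beta_lp_l)^{-1},\qquad
\tau_i(\mathrm{I})=1-\sum_{l\in\mathrm{I}}\big(p_l^{-1}-p_i^{-1}\big)\beta_l^{-1},
$$

and remark that the condition $\Upsilon(\beta,p,\mathcal{P})>0$ implies that $\tau(\mathrm{I})>0$ for any $\mathrm{I}\in\mathcal{I}_d$.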

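Let us first prove the following simple fact. Denote $\mathcal{C}_i(\mathrm{I})=\{\mathrm{J}\subseteq\mathrm{I}:\ i\in\mathrm{J}\}$, $i\in\mathrm{I}$. Then

$$
(4.27)\qquad \beta_i(\mathrm{I})=\inf_{\mathrm{J}\in\mathcal{C}_i(\mathrm{I})}\beta_i(\mathrm{J}),\quad\forall i\in\mathrm{I}.
$$

Indeed, we remark that $\tau_i(\mathrm{J})=1-\sum_{l\in\mathrm{J}}\big(p_l^{-1}-p_i^{-1}\big)\beta_l^{-1}=\tau(\mathrm{J})+p_i^{-1}\sum_{l\in\mathrm{J}}\beta_l^{-1}$ and, therefore,

$$
\beta_i(\mathrm{J})=\frac{\beta_i\,\tau(\mathrm{J})}{\tau(\mathrm{J})+p_i^{-1}\beta^{-1}(\mathrm{J})},\qquad \beta^{-1}(\mathrm{J})=\sum_{l\in\mathrm{J}}\beta_l^{-1}.
$$

We obviously have $\tau(\mathrm{J})\ge\tau(\mathrm{I})$ and $\beta^{-1}(\mathrm{J})\le\beta^{-1}(\mathrm{I})$ for any $\mathrm{J}\subseteq\mathrm{I}$. It remains to note that
$x\mapsto x/(x+a)$ is increasing on $\mathbb{R}_+$ for any $a>0$ and (4.27) follows.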
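The identity (4.27) can be checked numerically by brute force over all subsets. The sketch below assumes the definitions $\tau(\mathrm{J})=1-\sum_{l\in\mathrm{J}}(\beta_lp_l)^{-1}$, $\tau_i(\mathrm{J})=1-\sum_{l\in\mathrm{J}}(p_l^{-1}-p_i^{-1})\beta_l^{-1}$ and $\beta_i(\mathrm{J})=\tau(\mathrm{J})\beta_i/\tau_i(\mathrm{J})$; the numerical values of $\beta$ and $p$ and all function names are hypothetical and serve only as an example.

```python
from itertools import combinations


def tau(J, beta, p):
    """tau(J) = 1 - sum over l in J of 1 / (beta_l * p_l)."""
    return 1.0 - sum(1.0 / (beta[l] * p[l]) for l in J)


def tau_i(i, J, beta, p):
    """tau_i(J) = 1 - sum over l in J of (1/p_l - 1/p_i) / beta_l."""
    return 1.0 - sum((1.0 / p[l] - 1.0 / p[i]) / beta[l] for l in J)


def beta_i(i, J, beta, p):
    """beta_i(J) = tau(J) * beta_i / tau_i(J)."""
    return tau(J, beta, p) * beta[i] / tau_i(i, J, beta, p)


def subsets_containing(i, I):
    """All subsets J of I with i in J, as tuples."""
    others = [l for l in I if l != i]
    for size in range(len(others) + 1):
        for rest in combinations(others, size):
            yield (i,) + rest


def check_427(beta, p, I, tol=1e-12):
    """True if beta_i(I) equals the minimum of beta_i(J) over J in C_i(I)."""
    return all(
        abs(min(beta_i(i, J, beta, p) for J in subsets_containing(i, I))
            - beta_i(i, I, beta, p)) < tol
        for i in I
    )


# Hypothetical example values with tau(I) > 0.
BETA = {0: 2.0, 1: 1.5, 2: 3.0}
P = {0: 2.0, 1: 4.0, 2: 3.0}
I_SET = (0, 1, 2)

if __name__ == "__main__":
    print(tau(I_SET, BETA, P), check_427(BETA, P, I_SET))
```

The check only covers one parameter configuration; it illustrates the monotonicity argument and is not a substitute for it.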

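Let $\mathcal{P}'\in\mathfrak{P}$ be an arbitrary partition. Since $f\in\mathbb{N}_{p,d}(\beta,L)$ we have $f_{\mathrm{J}}\in\mathbb{N}_{p_{\mathrm{J}},|\mathrm{J}|}(\beta_{\mathrm{J}},L_{\mathrm{J}})$ and,
therefore, in view of Lemma 4 we have for any $h\in(0,1]^d$ and $\mathrm{J}\in\mathcal{P}\circ\mathcal{P}'$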

b hj <c(|J|, PJ ,b)ki J| ^£ i / l f i(J) < Cl J2£ih? l[l) . 

ieJ iei 

To get the last inequality we use (4.27), h G (0, l] d and we have put ci = kf sup J€ld c(|J|,pj, b)k| J '. 
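Noting that the right hand side of the latter inequality is independent of $\mathrm{J}$ we obtain

$$
B(h,\mathcal{P})\le c_1\sup_{\mathrm{I}\in\mathcal{P}}\sum_{i\in\mathrm{I}}L_ih_i^{\beta_i(\mathrm{I})},\quad h\in(0,1]^d.
$$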

It remains to choose the multi-bandwidth $h$. To do so, it suffices to solve, for any $\mathrm{I}\in\mathcal{P}$, the following
system of equations.
The solution is given by

( 7l(ff,P) 
re ' iei 

Here we have also used that $1/\gamma_{\mathrm{I}}(\beta,p)=\sum_{i\in\mathrm{I}}1/\beta_i(\mathrm{I})$. The assertion of the theorem follows now
from Theorem 1. ∎



5. Appendix. 

5.1. Proof of Proposition 1. 1°. Note that $M(z)=M(|z|)$ since $M$ is symmetric, which implies

$$
\chi_r(y)=n^{-1}\sum_{i=1}^{n}\Big[M_r\big(\rho(Y_i,y)\big)-\mathbb{E}^{(n)}_g\big\{M_r\big(\rho(Y_i,y)\big)\big\}\Big],
$$

where $\rho:\mathbb{R}^s\times\mathbb{R}^s\to\mathbb{R}^s_+$ is given by $\rho(z,z')=\big(|z_1-z_1'|,\ldots,|z_s-z_s'|\big)$.

We conclude that considered family of random fields obeys the structural assumption introduced 
in Section 4.4. of Lepski (2012), with d = s, Xf = Xf = R s and p l : R x R -»• R is given by |z - z'| 
for any Z = 1, s. It implies in particular that M s is equipped with the metric g s generated by the 
supremum norm, i.e. g s = max ; _j^ pi. We remark also that in our case K(u) = Ylt=i M(itj), it E 
IR S , 3 = 1 and 7/ = 1, 1 = 1, s. 

To get the assertion of Proposition 1 we will apply Theorem 9 in Lepski (2012) on lZ n (s) := 
[1/n, l] s . Note that obviously lZ n C 1Z n (s). Thus, we have to check the assumptions of the latter 
theorem and to match the notations used in the present paper and in Lepski (2012). 

First we note that since M satisfies Assumption 1 Assumption 9 (i) is obviously fulfilled with 
L\ = (3s/2)(m 00 ) s_1 L. Moreover Assumption 9 (ii) holds because g = 1. 
Thus, Assumption 9 is checked. 

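Consider the collection of closed cubes $B_{1/2}(\mathrm{j})=\{z\in\mathbb{R}^s:\ \varrho_s(z,\mathrm{j})\le 1/2\}$, $\mathrm{j}\in\mathbb{Z}^s$, and let $\mathfrak{E}_{B_{1/2}(\mathrm{j})}(\delta)$, $\delta>0$,
denote the metric entropy of $B_{1/2}(\mathrm{j})$ measured in the metric $\varrho_s$.

Obviously $\{B_{1/2}(\mathrm{j}),\ \mathrm{j}\in\mathbb{Z}^s\}$ is a countable cover of $\mathbb{R}^s$ and each member of this collection is
a totally bounded (even compact) subset of $\mathbb{R}^s$. It is easily seen that

$$
\mathrm{card}\Big(\big\{\mathrm{k}\in\mathbb{Z}^s:\ B_{1/2}(\mathrm{j})\cap B_{1/2}(\mathrm{k})\neq\emptyset\big\}\Big)\le 3^s,\quad\forall\mathrm{j}\in\mathbb{Z}^s.
$$

Using the terminology of Lepski (2012) we can say that $\{B_{1/2}(\mathrm{j}),\ \mathrm{j}\in\mathbb{Z}^s\}$ is a $3^s$-totally bounded cover
of $\mathbb{R}^s$. Moreover, $\mathfrak{E}_{B_{1/2}(\mathrm{j})}(\delta)=s[\ln(1/\delta)]_+$ for any $\delta>0$ and any $\mathrm{j}\in\mathbb{Z}^s$. All of the above allows us to
assert that Assumption 7 (i) is fulfilled with $\mathrm{I}=\mathbb{Z}^s$, $\mathrm{X}_{\mathrm{j}}=B_{1/2}(\mathrm{j})$, $N=1.5s$ and $R=1$. It remains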
to note that Assumption 7 (ii) is automatically fulfilled in our case since g = 1. 
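The counting claim for the cover by cubes centred at integer points can be verified directly in small dimensions. The sketch below assumes the cubes are the closed sup-norm balls of radius $1/2$ around the points of $\mathbb{Z}^s$, so that two of them intersect exactly when their centres differ by at most $1$ in every coordinate; the function names are hypothetical.

```python
from itertools import product


def cubes_intersect(j, k):
    """Closed sup-norm balls of radius 1/2 around integer points j and k
    intersect if and only if max_l |j_l - k_l| <= 1."""
    return max(abs(a - b) for a, b in zip(j, k)) <= 1


def count_neighbours(j, radius=3):
    """Number of integer centres k (searched within `radius` of j in each
    coordinate) whose cube meets the cube around j, including k = j."""
    s = len(j)
    return sum(
        cubes_intersect(j, tuple(a + d for a, d in zip(j, shift)))
        for shift in product(range(-radius, radius + 1), repeat=s)
    )


if __name__ == "__main__":
    for centre in [(0,), (0, 0), (2, -1, 5)]:
        print(len(centre), count_neighbours(centre))
```

For $s=1,2,3$ the count equals $3^s$, in agreement with the bound used above.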

Also we note that for any $\mathrm{j},\mathrm{k}\in\mathbb{Z}^s$ satisfying $B_{1/2}(\mathrm{j})\cap B_{1/2}(\mathrm{k})=\emptyset$ one has

$$
\inf_{x\in B_{1/2}(\mathrm{j})}\ \inf_{y\in B_{1/2}(\mathrm{k})}\varrho_s(x,y)\ge 1
$$

and, therefore, Assumption 11 is checked with $t=1$. At last we have for any $n\ge 1$



sup sup 

r£lZ n (s) ug(0,l] 



\m{ui/ n ) 



o. 



since $\mathrm{supp}(M)\subset[-1/2,1/2]$. Hence, the condition (4.24) of Theorem 9 is fulfilled as well, which
completes the verification of the assumptions of the theorem.

2°. Let us match the notations. First, in our case ni = ri2 = n. Since Yj, j > 1, are identically 
distributed the quantity denoted F n2 (r, x^) is given now by G(r, y) = J Rs \M r (x — y)|g(x)dx and, 
therefore, is independent on n. Here we have taken into account that 6 \ d = M s . 

It is easily seen that

$$
(5.1)\qquad G_n:=\sup_{r\in[1/n,1]^s}\|G(r,\cdot)\|_\infty\le\min\big\{m_1^{s}\|g\|_\infty,\ m_\infty^{s}n^{s}\big\}.
$$

It yields, in particular, that $F_{n_2}=G_n\le m_1^{s}\|g\|_\infty$ for any $n\ge 1$.
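The bound (5.1) can be examined numerically for $s=1$. The sketch below rests on assumptions that are not taken from the paper: $m_1$ and $m_\infty$ are read as the $L_1$-norm and the sup-norm of $M$, the kernel $M$ is the triangular kernel on $[-1/2,1/2]$ (so $m_1=1$, $m_\infty=2$), and $g$ is the standard normal density. The supremum over $r$ and $y$ is approximated on finite grids, so the computed value is only a lower approximation of $G_n$.

```python
import math


def M(u):
    """Triangular kernel on [-1/2, 1/2] (hypothetical choice), integral 1."""
    return 2.0 * (1.0 - 2.0 * abs(u)) if abs(u) <= 0.5 else 0.0


def g(x):
    """Standard normal density (hypothetical choice for the density of Y)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def G(r, y, n_grid=400):
    """Midpoint rule for int |M_r(x - y)| g(x) dx = int |M(u)| g(y + r u) du."""
    step = 1.0 / n_grid
    total = 0.0
    for k in range(n_grid):
        u = -0.5 + (k + 0.5) * step
        total += abs(M(u)) * g(y + r * u) * step
    return total


def G_n(n, r_points=10, y_points=21):
    """Grid approximation of sup over r in [1/n, 1] and y in [-2, 2] of G(r, y)."""
    rs = [1.0 / n + i * (1.0 - 1.0 / n) / (r_points - 1) for i in range(r_points)]
    ys = [-2.0 + 4.0 * i / (y_points - 1) for i in range(y_points)]
    return max(G(r, y) for r in rs for y in ys)


M_1, M_INF = 1.0, 2.0                    # L1-norm and sup-norm of M
G_SUP = 1.0 / math.sqrt(2.0 * math.pi)   # sup-norm of the standard normal density


def bound(n):
    """Right hand side of (5.1) for s = 1."""
    return min(M_1 * G_SUP, M_INF * n)


if __name__ == "__main__":
    for n in (5, 50):
        print(n, G_n(n), bound(n))
```

For a bounded density the first term of the minimum is the active one; the second term, $m_\infty^s n^s$, matters only when $\|g\|_\infty$ is very large compared with $n^s$.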

Choosing in Theorem 9 g = p, t> = 2p + 2, z = 1 and remembering that = y, we have 



(n, r, rr^) < 7p (a, m*,). 



'G(r)ln(ra) 



for any x^ = y € and any r G 7Z n (s) C 1Z n {s). To get this assertion we have used that 
G n < (moon) 5 in view of (5.1). 

At last, taking into account that the right hand side of the latter inequality is independent of
$y$, we deduce from Theorem 9 that for any $p\ge 1$

v 



E 



(n) 



sup 



ML ~ 7p(s,m 00 )- 



'G(r)ln(n) 
nVr 



< ci(p,s)[l V mf ||g||ooJ 2 « 2 +c 2 (p, s)n 



where ci(p,s) = 2 7 f/ 2 + 5 3P +5s+4 r(p + 1) ^(sjiiioo) and C2(p,s) = 2 p+1 3 5s . Here we have also used 
that G n < mf Uglloo in view of (5.1) that implies F n2 < 1 V mf ||g||oo- I 



5.2. Proof of Proposition 2. First, noting that J p [s, moo) = 1/2 we obtain from Proposition 
1 that 

(5-2) 4 n) { sup (\\xr\L - IJGV)) V < c n , 

r l - — £ — 

where we have put for brevity c n = ci(p, s) [1 V mf ||g||ooJ 2 n 2 +C2{p, s)n~ p . Next, putting Xr(y) = 

T r (y) — EgT r (y) we have in view if (5.2) 

(5-3) 4 n) { sup (\\xr\L - IJOir)) V < cn. 

To get the latter result we remarked that if M satisfies Assumption 1 then |M| satisfies it as well 
and, therefore, Proposition 1 is applicable to the process Xr(')- It remains to note that the function 
G(-) is the same for both processes Xr(-) and Xr(-)- We also note that 

G(r) = sup M n) T r (y)| 
and, therefore, for any $r\in(0,1]^s$ one has
(5.4) G(r) = 1 V 



E^ n) T r 



where we have used the obvious inequality |||x|| V \ \z 
normed vector space. 



Hence, putting ( n (a) = sup r( _ n ^ ){s) 



< IV IIT^ + ||x,|| oc , 

||y|| V ||z|| | < ||x — y\\ being true for any 
II Xr || oo — \ \/G{r) we obtain for any r G 1Zn\s) 



G(r) <\\l 



It yields 



G(r) - 2 (1 V IITVH^) J < 2Cn(a) and we have in view of (5.3) 
(n) { sup [G(r)-2(lV||r r |U]) P <2?c n . 

Similarly to (5.4) we have 

1 V HT^ < G(r) + \\xr\L < (3/2)G(r) + Cn(o) 
and, therefore [l V [|TTt-||oo _ ( 3 / 2 ) G ( r )] + < Cn(a). Thus, we get from (5.3) 



E 



n) 



sup [^^^^-(S/^r)] <c n 



5.3. Proof of Lemma 1. We have in view of Fubini theorem for any x\ £ M 1 
s LrjT (^i) = / [Kh! * (*i - xi)fi(ti)dti = K vi {yi) K h! (*i - xi - 2/i)dyi 



fi{h)dti 



K hl (ti-zi)fi{ti)dti 



Therefore, 



/ K m (zi - xi) 

Sfn (xi) + K m (zi- xi) / K hl (ti - zi) { fi (ti) - fi (zj) } dti 
\\ s hi tV1 - s mllioo - bh i / K m{vi) d ^ ^ k iK- 



dzj. 



5.4. Proof of Lemma 2. The proof of the lemma is completely based on the application of
Propositions 1-2 and Corollary 1.

Proof of (i). Recall that $\zeta(h,\mathcal{P})=\sup_{\mathrm{I}\in\mathcal{P}}\big\|\xi_{h_{\mathrm{I}}}\big\|_{\mathrm{I},\infty}$ and



Cn = sup sup 
ven n Ve<$ 



C(v,V) - AA n (r],V) 



Then, we have 
(5.5) \Ef\Cn) 2q 



2q 



EEk"> 



veyiev 



sup 

»7l6wl 0i) (|I|) 



ll^mlll,oo -72g(|I|,koo), 



' s n ln(n) 



nV r 



m 



2g\ 2q 



where we have put ^ 0i) (|I|) = {m E (0, : nV m > of 1 ln(n)} and ai = [2-ya, (1,1^)] 2 . 

To get the latter result we have used first that A n (j),V) = sup Ig p J Sn ^y^ and the trivial inequality 



[supjXj — supjyj] + < supjxj — yi]+- Next we have used that A = sup-p g <p sup Ig -p 72 9 (|I|, koo) . At 
last we have used that rj E % n implies rfi E %i 0l \|I|) for any I E Id in view of the definition of a*. 
Note that for any for any I E Id and any rj\ E (0, l]' 1 ' 



s > 1 V 



/ |^(*I--)|/l(*l) d *I 
JR 1 



I.oo 



We conclude that Proposition 1 is applicable with %r — 671 > M = K, p = 2q, s = |I|, a = ai, G = F\ 
and the assertion (i) follows with 

Cl (2g, d, K, f) = J2 £ [ Cl ( 2( ?' I 1 ') t 1 V k i I|f ] 9 + C2 ( 2 «' I 1 ')" 
Proof of (ii). Put for any /i E % n and I E Id 



si(hi) 



\K hl (ti--)\fi{ti)dti 



1,00 



fl,n{ h i) = n-^lKh^Xj, 



i=i 



I,oo 



We have similarly to (5.5) [s n — f n ] + < sup sup [si(/ii) — 2fi jn (/ii)] + and hence 



sup 

fciewi 0i) (|i|) 



-2fj, n (/ii) 



2<A 2q 



The assertion (ii) follows now from the second statement of Proposition 2 with 
Ca(2g,d,K,f) = ]T [c[(2q, |I|) [l V kf'f] 9 + c' 2 (2q, |I|) 



Proof of (iii). We have [f n — 3s n ] + < 2sup sup [fi,n(^i) ~~ (3/2)si (^i)]+ and hence 



IeZ ^ lG ^ ai) (|i|) 



E^ n) [f n - 3s r 



l -'i 



< 



2 We 



(n) 
/ 



sup 
hieH ( * l) (\i\\ 



ft.nfa) " (3/2)si(/ii) 



2g\ 2<j 



The assertion (iii) follows now from the first assertion of Proposition 2 with 
c 3 {2q, d, K, f) = 2 £ [ Cl (2g, |I|) [1 V kf'f] " + c 2 (2g, |I|) 
Proof of (iv). Note that

U : = 2d 2 kf(f„)L d2 / 4 J +1 [ max {f n ,k?f}] d - 1 



(5.6) 



< p [(f re ) ^'^ +d + (i + k?f) d ^(f n ) L d2 / 4 J +1 + (f^" 1 + (1 + k^f)"" 1 



where we have used ki > 1 and put = 2d 2 kp\- d2/4 \ +d . Thus, to get the assertion (iv) it suffices 
to bound from above Ej(f n ) p , p > 1. We obviously have 



f« < ^2 sup n 



and using Corollary 1 we get for p > 1 



i=l 



I,oo 



(5.7) 



4 n) (fn)y < £ [iVk^f] 5 [ 7p (|I|,koo)+{ci(p,|I|)+C 2 (p,|I|)}: 

iei d 



The assertion (iv) follows now from (5.6) and (5.7). 



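5.5. Proof of Lemma 4. The proof of the lemma is based on the embedding theorem for
anisotropic Nikolskii classes which we formulate below.

Let $(\alpha,r)\in(0,\infty)^s\times[1,\infty]^s$ be such that $\varkappa=1-\sum_{l=1}^{s}(\alpha_lr_l)^{-1}>0$ and let $Q\in(0,\infty)^s$. Then
there exists $c>0$ completely determined by $\alpha$, $r$ and $s$ such that

$$
(5.8)\qquad \mathbb{N}_{r,s}(\alpha,Q)\subseteq\mathbb{N}_{\infty,s}(\boldsymbol{\alpha},cQ),
$$

where $\boldsymbol{\alpha}=(\boldsymbol{\alpha}_1,\ldots,\boldsymbol{\alpha}_s)$, $\boldsymbol{\alpha}_j=\varkappa\alpha_j\varkappa_j^{-1}$ and $\varkappa_j=1-\sum_{l=1}^{s}\big(r_l^{-1}-r_j^{-1}\big)\alpha_l^{-1}$.

The inclusion (5.8) is a particular case of Theorem 6.9 in Nikol'skii (1977), with $p'=\infty$. We
remark that $\mathbb{N}_{\infty,s}(\boldsymbol{\alpha},Q)$ is the anisotropic Hölder class of functions.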
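To make the exponents in (5.8) concrete, the following sketch computes $\varkappa$, $\varkappa_j$ and the Hölder exponents $\boldsymbol{\alpha}_j=\varkappa\alpha_j/\varkappa_j$ from given $(\alpha,r)$. It assumes the formulas exactly as stated above, with $1/\infty$ read as $0$; the function name `embedded_exponents` and the example values are hypothetical.

```python
import math


def embedded_exponents(alpha, r):
    """Return (kappa, [bold_alpha_1, ..., bold_alpha_s]) for the embedding (5.8).

    kappa   = 1 - sum_l 1 / (alpha_l * r_l)
    kappa_j = 1 - sum_l (1/r_l - 1/r_j) / alpha_l
    bold_alpha_j = kappa * alpha_j / kappa_j
    Entries of r may be math.inf, in which case 1/r is taken to be 0.
    """
    inv_r = [0.0 if math.isinf(x) else 1.0 / x for x in r]
    kappa = 1.0 - sum(iv / a for a, iv in zip(alpha, inv_r))
    if kappa <= 0.0:
        raise ValueError("the condition kappa > 0 is violated")
    exponents = []
    for j, a_j in enumerate(alpha):
        kappa_j = 1.0 - sum((inv_r[l] - inv_r[j]) / alpha[l]
                            for l in range(len(alpha)))
        exponents.append(kappa * a_j / kappa_j)
    return kappa, exponents


if __name__ == "__main__":
    # r = (inf, inf): the class is already a Hoelder class, nothing is lost.
    print(embedded_exponents([1.0, 2.0], [math.inf, math.inf]))
    # r = (2, 2): kappa = 1/4 and kappa_j = 1, so each exponent shrinks by 1/4.
    print(embedded_exponents([1.0, 2.0], [2.0, 2.0]))
```

When all $r_j$ coincide, every $\varkappa_j$ equals $1$ and the smoothness in each direction is reduced by the common factor $\varkappa$.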

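Let $E_i$, $i=\overline{1,s}$, be the family of $s\times s$ matrices where $E_i=(\mathbf{e}_1,\ldots,\mathbf{e}_i,0,\ldots,0)$ and let $E_0$ be the zero
matrix. Putting $\mathbf{K}(u)=\prod_{l=1}^{s}K(u_l)$, $u\in\mathbb{R}^s$, we get for any $\eta\in(0,\infty)^s$ and any $z\in\mathbb{R}^s$

$$
\big|B_{\eta,g}(z)\big|=\bigg|\int_{\mathbb{R}^s}\mathbf{K}(u)\big[g(z+u\eta)-g(z)\big]du\bigg|
\le\sum_{i=1}^{s}\bigg|\int_{\mathbb{R}^s}\mathbf{K}(u)\big[g(z+\eta E_iu)-g(z+\eta E_{i-1}u)\big]du\bigg|.
$$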

We note that all the components of the vectors $z+\eta E_iu$ and $z+\eta E_{i-1}u$, except the $i$-th coordinate,
coincide. Hence, using Taylor expansion, we obtain for any $\eta\in(0,\infty)^s$ and $z\in\mathbb{R}^s$, in view of (5.8),

$$
\bigg|\int_{\mathbb{R}^s}\mathbf{K}(u)\big[g(z+\eta E_iu)-g(z+\eta E_{i-1}u)\big]du\bigg|
\le cQ_i\eta_i^{\boldsymbol{\alpha}_i}\int_{\mathbb{R}^s}|\mathbf{K}(u)||u_i|^{\boldsymbol{\alpha}_i}du\le k_1^{s}\,cQ_i\eta_i^{\boldsymbol{\alpha}_i}.
$$

To get the last inequality we have taken into account (3.1) and used that $K$ is supported on
$[-1/2,1/2]$. It is worth mentioning that $c$ as a function of $\alpha$ is bounded on any bounded domain
of $(0,\infty)^s$. Since the right hand side of the latter inequality is independent of $z$ we come to the
assertion of the lemma. ∎